{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "fe03b687-98ba-42de-aec6-29e5e8c9313d",
   "metadata": {},
   "source": [
    "Chapter 09\n",
    "\n",
    "# 施密特正交化，自定义函数\n",
    "《线性代数》 | 鸢尾花书：数学不难"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "da190c00-d41b-4b71-9716-c99e19a2beb7",
   "metadata": {},
   "source": [
    "这段代码通过 Gram-Schmidt 过程对一个实数矩阵 $A$ 进行正交化，完成 **QR 分解**：将矩阵 $A$ 分解为正交矩阵 $Q$ 和上三角矩阵 $R$，满足 $A = QR$。这一分解在数值分析、最小二乘法、特征值计算等领域具有重要意义。\n",
    "\n",
    "---\n",
    "\n",
    "首先导入 NumPy，并定义一个函数 `gram_schmidt_qr(A)`，用于执行格拉姆-施密特正交化过程。这个函数适用于任意形状为 $m \\times n$ 的实矩阵 $A$，其中 $m \\geq n$ 且 $A$ 的列向量线性无关。\n",
    "\n",
    "整个正交化过程的核心思想是将原始的 $A$ 按列处理，将每个列向量 $\\boldsymbol{a}_i$ 转换为单位正交向量 $\\boldsymbol{q}_i$，并用这些 $\\boldsymbol{q}_i$ 构成矩阵 $Q$。每列向量的处理分为两个步骤：**去除投影分量（正交化）** 和 **单位化**。\n",
    "\n",
    "---\n",
    "\n",
    "在第 $i$ 步中，我们处理矩阵 $A$ 的第 $i$ 列向量 $\\boldsymbol{a}_i$：\n",
    "\n",
    "1. 先依次从 $\\boldsymbol{a}_i$ 中减去它在先前所有 $\\boldsymbol{q}_j\\ (j < i)$ 上的投影：\n",
    "   $$\n",
    "   \\boldsymbol{a}_i' = \\boldsymbol{a}_i - \\sum_{j=0}^{i-1} R_{ji} \\boldsymbol{q}_j,\\quad\\text{其中}\\ R_{ji} = \\boldsymbol{q}_j^\\top \\boldsymbol{a}_i\n",
    "   $$\n",
    "   这一过程确保了 $\\boldsymbol{a}_i'$ 与所有之前的 $\\boldsymbol{q}_j$ 都正交。\n",
    "\n",
    "2. 然后将这个结果单位化：\n",
    "   $$\n",
    "   \\boldsymbol{q}_i = \\frac{\\boldsymbol{a}_i'}{\\|\\boldsymbol{a}_i'\\|},\\quad R_{ii} = \\|\\boldsymbol{a}_i'\\|\n",
    "   $$\n",
    "   从而得到新的单位正交向量 $\\boldsymbol{q}_i$。\n",
    "\n",
    "最终得到的矩阵 $Q = [\\boldsymbol{q}_1\\ \\boldsymbol{q}_2\\ \\dots\\ \\boldsymbol{q}_n]$ 满足列向量两两正交且范数为 1，即：\n",
    "\n",
    "$$\n",
    "Q^\\top Q = I_n\n",
    "$$\n",
    "\n",
    "而矩阵 $R$ 是上三角矩阵，记录了每个原始向量在正交基上的投影系数。这样就得到了：\n",
    "\n",
    "$$\n",
    "A = QR\n",
    "$$\n",
    "\n",
    "---\n",
    "\n",
    "在本例中，构造了如下 $3 \\times 3$ 的矩阵 $A$：\n",
    "\n",
    "$$\n",
    "A = \\begin{bmatrix}\n",
    "2 & 2 & 0 \\\\\n",
    "2 & 0 & 2 \\\\\n",
    "0 & 2 & 2\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "调用函数 `gram_schmidt_qr(A)` 后，返回的 $Q$ 为列正交矩阵，$R$ 为上三角矩阵。用 `np.round(Q.T @ Q, 6)` 验证 $Q^\\top Q$ 的结果是否为单位矩阵，确保向量组是单位正交组，说明正交化过程成功。\n",
    "\n",
    "---\n",
    "\n",
    "这套流程不仅实现了 QR 分解，还保留了每一步中的投影和归一化结构，有助于理解 QR 的几何意义：每一列是“去掉前面分量并重新归一化”后的新基向量。这个过程也为最小二乘法提供了核心工具。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9cc1a9df-7ea3-4c32-a28a-0e06a3560620",
   "metadata": {},
   "source": [
    "## 初始化"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "71ef2603-2e3a-4bde-ade4-0583a0434426",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fb17a82c-9562-4d41-9a5e-d4f072f824bd",
   "metadata": {},
   "source": [
    "## 自定义函数"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "d909cba1-033a-4f36-b8b3-5ed5cf688cec",
   "metadata": {},
   "outputs": [],
   "source": [
    "def gram_schmidt_qr(A):\n",
    "    \"\"\"\n",
    "    对实数矩阵 A 执行格拉姆-施密特正交化，返回 Q, R，使 A = Q @ R\n",
    "    Q 为单位正交列向量矩阵，R 为上三角矩阵\n",
    "\n",
    "    参数：\n",
    "    A : np.ndarray，形状 (m, n)，m ≥ n\n",
    "\n",
    "    返回：\n",
    "    Q : np.ndarray，形状 (m, n)，单位正交列向量\n",
    "    R : np.ndarray，形状 (n, n)，上三角矩阵\n",
    "    \"\"\"\n",
    "    m, n = A.shape\n",
    "    Q = np.zeros((m, n))\n",
    "    R = np.zeros((n, n))\n",
    "\n",
    "    for i in range(n):\n",
    "        ai = A[:, i].reshape(-1, 1)\n",
    "        for j in range(i):\n",
    "            qj = Q[:, j].reshape(-1, 1)\n",
    "            R[j, i] = (qj.T @ ai).item()\n",
    "            ai = ai - R[j, i] * qj      # 执行正交投影剔除\n",
    "\n",
    "        norm = np.linalg.norm(ai)\n",
    "        if norm < 1e-10:\n",
    "            raise ValueError(f\"第 {i+1} 列向量线性相关，无法正交化\")\n",
    "\n",
    "        Q[:, i] = (ai / norm).flatten()\n",
    "        R[i, i] = norm                # 对角线元素\n",
    "\n",
    "    return Q, R\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1f6d5009-14b1-4928-a3c7-f5562c5a69e4",
   "metadata": {},
   "source": [
    "## 创建数据"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "11326ba4-69fa-4288-b0a1-2606193ddd88",
   "metadata": {},
   "outputs": [],
   "source": [
    "A = np.array([\n",
    "    [2, 2, 0],\n",
    "    [2, 0, 2],\n",
    "    [0, 2, 2]])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "576ef840-3d31-4c92-954f-5c863053668b",
   "metadata": {},
   "source": [
    "## 调用函数"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "850fe931-d130-4846-a270-229a57fdbe11",
   "metadata": {},
   "outputs": [],
   "source": [
    "Q,R = gram_schmidt_qr(A)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "6e7d7fde-2693-4402-8074-6d3547e96279",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.70710678,  0.40824829, -0.57735027],\n",
       "       [ 0.70710678, -0.40824829,  0.57735027],\n",
       "       [ 0.        ,  0.81649658,  0.57735027]])"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Q"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "debeb642-65b7-448f-b417-0070ca0aee33",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[2.82842712, 1.41421356, 1.41421356],\n",
       "       [0.        , 2.44948974, 0.81649658],\n",
       "       [0.        , 0.        , 2.30940108]])"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "R"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "03465736-1ab3-47f7-bfcd-ef2840b4632f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1.,  0.,  0.],\n",
       "       [ 0.,  1., -0.],\n",
       "       [ 0., -0.,  1.]])"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 验证正交性\n",
    "np.round(Q.T @ Q, 6)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4e393dff-3200-49e8-bd4a-116152651905",
   "metadata": {},
   "source": [
    "作者\t**生姜DrGinger**  \n",
    "脚本\t**生姜DrGinger**  \n",
    "视频\t**崔崔CuiCui**  \n",
    "开源资源\t[**GitHub**](https://github.com/Visualize-ML)  \n",
    "平台\t[**油管**](https://www.youtube.com/@DrGinger_Jiang)\t\t\n",
    "\t\t[**iris小课堂**](https://space.bilibili.com/3546865719052873)\t\t\n",
    "\t\t[**生姜DrGinger**](https://space.bilibili.com/513194466)  "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:base] *",
   "language": "python",
   "name": "conda-base-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
